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INTRODUCTION 


SPARCS-2  (Simulation  Program  for  Assessing  the  Reliabilities  of  Complex 
Systems,  Version  2)  is  a PL/1  computer  program  for  assessing  (establishing 
interval  estimates  for)  the  reliability  and  the  MTBF  of  a large  and  complex 
system  of  any  modular  configuration.  The  system  can  consist  of  a complex 
logical  assembly  of  independently  failing  attribute  (binomial-Bemoulli) 
and  time-to-failure  (Poisson-exponential)  components,  without  any  regard  to 
their  placement.  Alternatively,  it  can  be  a configuration  of  independently 
failing  modules,  where  each  module  has  either  or  both  attribute  and  time- 
to-failure  components. 

The  raw  data  for  assessments  are  the  component  failure  history  data 
and  the  system  configuration.  The  historical  data  are  "successes  and 
failures"  for  binomial-Bemoulli  components  and  "failures  and  testing  time 
(normalized  to  'mission  equivalent  units')"  for  time-to-failure  components. 
The  configuration  data  consist  of  a list  or  lists  of  minimal  paths  ("minimal 
path  sets"  or  "tie  sets"),  or  else  a list  of  minimal  cuts  ("minimal  cut 
sets"),  for  the  system  as  a list  of  modules,  and  for  each  module  as  a list 
of  components.  If  the  MTBF  assessment  option  is  selected,  the  system 
"mission  time"  is  also  needed. 

The  underlying  mathematical  model,  identical  with  that  incorporated 
into  the  first  version  of  the  SPARCS  program  described  in  [5] , is  an  amal- 
gamation of  Boolean  logic,  probability,  and  Bayesian  and  Monte  Carlo 
techniques.  The  system  reliability,  a numerical-valued  function  of  the 
component  reliabilities,  is  derived  by  the  method  of  inclusion-exclusion 
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(IE),  also  known  as  Poincare’s  theorem,  from  the  minimal  paths  or  the 
minimal  cuts.  The  failure-history  data  are  "sufficient  statistics,"  for 
the  parameters  of  Bayesian  conjugate  prior  distributions  (c.p.d.’s)  on 
the  component  reliabilities,  "beta"  for  attributes  and  "negative-log  gamma" 
for  time  to  failure. 

SPARCS  assesses  by  Monte  Carlo.  At  each  trial,  for  each  component, 
a value  of  the  reliability  is  generated  from  the  c.p.d.  and  substituted 
into  the  system  function,  to  obtain  a value  of  the  system  reliability  for 
that  trial.  The  resulting  "eTnpirical,,  distribution  of  system  reliabilities, 
obtained  over  a series  of  trials,  provides  the  basis  for  an  assessment.  Per- 
centage points  on  that  distribution  are  interpreted  as  system  reliability 
confidence  limits.  The  corresponding  MTBF  confidence  limits  are  calculated, 
based  on  the  simple  relationship  between  the  reliability  and  the  MTBF. 

SPARCS  was  developed  at  Oklahoma  State  University  in  a series  of 
stages.  Initially  J.  L.  Burris  [1]  prepared  an  estimation  program  MAPS 
(Method  for  the  Analysis  of  the  Probabilities  of  Systems)  to  calculate  the 
system  reliability  as  an  exact  function  of  the  component  reliabilities, 
based  upon  system  logical  input  data.  MAPS  was  modelled  to  a great  extent 
after  a similar  system  programmed  in  FORTRAN  and  assembler  language  called 
SCOPE  (System  for  Computing  Operational  Probability  Equations)  [2,  3], 
which  had  been  developed  at  Rockwell  International  in  the  1960s  under  a 
NASA  contract.  Certain  significant  improvements  were  incorporated  into 
MAPS,  as  follows: 

1.  PL/1  programming:  this  made  it  possible  to  develop  a fully 
"portable"  program  package  with  efficient  binary  digit 
manipulation  capability,  in  a more  adaptable  higher  level 
language  than  FORTRAN. 
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2.  Modularity:  this  made  it  possible  to  generate  the  equation 
and  calculate  the  probability  for  a system  of  substantially 
larger  sizes  than  can  be  processed  with  SCOPE. 

3.  "Minimal  path"  or  else  "minimal  cut"  calculations:  by 
taking  advantage  of  the  dual  relationship  between  the  paths 
and  the  cuts,  the  same  software  can  be  used  either  to  ob- 
tain a system  reliability  function  of  the  component  relia- 
bilities from  the  minimal  paths,  or  else  an  unreliability 
function  of  the  component  unreliabilities  from  the  minimal 
cuts. 


Under  Contract  F33615-74-C-4072 , Cooley  [4]  programmed  the  initial 
version  of  the  SPARCS  system  for  system  reliability-  and  MTBF-confidence 
assessment.  The  same  logical  configuration  data  are  needed  as  for  MAPS. 
Instead  of  inputting  the  component  probabilities  as  in  MAPS,  however,  the 
component  failure-history  data  are  used,  for  both  attribute  and  time-to- 
failure  components.  The  attribute  data  are  accumulated  successes  and 
failures;  the  time-to-f ailure  data  are  accumulated  failures  and  testing 
time. 

The  initial  version  of  SPARCS,  as  documented  in  [4]  and  [5],  in  our 
opinion  represents  a landmark  and  a significant  state-of-the-art  improve- 
ment over  other  programs  designed  for  estimating  or  assessing  system  re- 
liability. The  results  of  an  assessment  can  be  verified  in  the  sense  that 
they  are  reproducible;  that  is,  one  can  input  either  the  minimal  paths  or 
else  the  minimal  cuts  and  obtain  a reasonably  similar  assessment  for  any 
given  set  of  input  data.  In  addition,  accurate  reproducible  assessments 
can  be  obtained  with  many  fewer  trials  (20-100)  than  by  older  Monte  Carlo 
programs,  which  sample  missions  rather  than  reliabilities  and  therefore 
require  many  thousands  of  trials. 

There  are  some  aspects  of  the  initial  version  of  SPARCS  which  made 
it  relatively  difficult  to  use.  The  most  significant  problem  in  this 
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regard  was  the  fact  that  in  order  to  generate  beta  and  gamma  deviates,  it 
was  necessary  to  set  up  calls  to  proprietary  IMSL  routines  programmed  in 
FORTRAN  and  assembler  languages.  This  is  a clear  interference  with  the 
portability  feature,  since  the  rest  of  the  program  is  PL/1  and  nonpro- 
prietary. It  also  made  the  JCL  language  input  more  complicated.  In  addi- 
tion, the  program  was  also  overly  long,  and  required  too  much  core  space 
and  running  time. 

Oklahoma  State  University  was  awarded  a second  contract,  F336 15-76- 
C-3094,  to  make  improvements  in  SPARCS,  as  well  as  to  perform  certain  re- 
lated studies,  such  as  error  analysis  of  the  random  deviate  generators 
and  model  validation.  In  the  process,  we  developed  a program,  now  called 
SP ARCS-2,  which  not  only  has  corrected  major  deficiencies  in  the  original 
version,  but  is  also  more  efficient  and  much  more  serviceable  from  the 
viewpoint  of  the  user. 

The  following  represent  the  improvements  in  SPARCS-2: 

1.  The  new  beta  and  gamma  generators,  CABTA  and  RGAMA  respec- 
tively, both  employing  a "rejection"  technique,  are  much 
faster  than  the  IMSL  routines  in  the  original  version. 

These  generators  are  described  in  greater  detail  in 
Division  2 of  this  report. 

2.  The  program  is  more  compact,  about  % the  original  size, 
and  makes  extensive  use  of  the  dynamic  storage  allocation 
feature  of  the  PL/1  language. 

3.  Core  requirements  are  less  than  in  the  original  version 
and  are  variable;  for  example,  very  small  problems  can 
take  about  100K. 

4.  There  is  an  improved  f,super  modularity11  capability.  Modules 
with  minimal-cut  unreliability  calculations  can  be  mixed 
with  those  having  minimal-path  reliability  calculations. 

However,  all  output  has  been  standardized  to  system  relia- 
bility or  "probability  of  success,"  regardless  of  the  form 
in  which  the  input  data  are  presented,  and  whatever  the 
configuration  of  modules  or  elements  within  modules. 
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5.  Larger  systems  can  be  handled,  and  limitations  are  more 
clearly  spelled  out*  A system  can  consist  of  a configura- 
tion of  up  to  128  modules  or  components,  with  up  to  256 
minimal  states.  Likewise,  a module  within  the  system  can 
have  128  components,  and  256  minimal  states*  A probability 
equation  can  have  up  to  3500  terms. 

6.  Internal  documentation  is  very  much  improved,  based  on 
"structured  programming"  concepts,  to  facilitate  the  prepa- 
ration of  input  data,  and  for  the  benefit  of  users  who  may 
have  to  make  alterations  in  capacity  or  in  some  of  the 
internal  procedures. 
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ABSTRACT 


This  paper  deals  with  the  logical  formulation  of  a system  for  purposes 
of  reliability  analyses  and  both  exact  and  approximate  methods  of  calculating 
the  system  reliability.  The  first  part  deals  with  the  logical  concepts,  and 
the  second  part  with  probability  calculations. 

The  logical  formulation  in  Part  1 starts  from  first  principles.  The  uni- 
versal set  U of  system  states  is  a "Boolean  algebra".  The  "power  set"  & is 
the  set  of  subsets  of  U.  The  subset  of  system  success  states  or  "paths"  is  a 
"lattice"  within  U,  also  an  element  of  represented  by  a Boolean  polynomial. 

The  terms  of  the  polynomial  are  monomials  which  give  the  indicators  for  the 
"sublattices"  or  "complete  subsets"  of  U within  the  lattice  of  paths.  The 
optimal  representation  of  this  lattice  polynomial  is  in  the  minimalized  form; 
the  terms  of  the  minimalized  lattice  of  paths  are  called  the  "minimal  paths". 
Similarly,  the  subset  of  system  failure  states  or  "cuts"  is  a lattice  polynomial 
whose  terms  are  sublattices  within  the  lattice  of  cuts;  the  terms  of  the  mini- 
malized lattice  polynomial  are  the  "minimal  cuts".  Logically  consistent  systems 
(also  known  as  "coherent  structures")  have  certain  properties  with  respect  to 
the  partial  ordering  of  paths  and  cuts.  The  concepts  of  Boolean  logic,  mini- 
malizatlon,  etc. , apply  to  systems  that  do  not  have  the  "consistency"  property, 
as  well  as  to  those  that  do  have  it. 

The  second  part  of  this  paper  deals  with  ways  of  using  a minimalized  lattice 
polynomial  to  derive  a numerical-valued  probability  function  from  which  to  cal- 
culate the  system  reliability:  also  computer  software,  error  bounds  and  ap- 
proximations. The  reliability  is  the  probability  that  the  actual  state  of  the 
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system  is  an  element  of  the  lattice  of  paths.  The  exact  probability  is  derived 
from  the  minimalized  lattice  polynomial  of  paths  by  the  method  of  inclusion- 
exclusion,  also  known  as  Poincare's  Theorem.  Dually  the  probability  of  failure, 
or  system  unreliability,  is  derived  from  the  minimalized  lattice  of  cuts.  Mod- 
ularization and/or  inversion  can  be  helpful  in  keeping  the  probability  function 
reasonably  small  in  size.  Computer  software  is  available  both  to  generate 
the  probability  polynomial  and  to  calculate  either  the  reliability  or  unrelia- 
bility. Approximations  can  be  used  based  upon  simplifying  assumptions  which 
delete  low  probability  terms  from  the  function.  Conservative  error  bounds  are 
obtainable  for  some  models.  The  approximation  techniques  include  serializing 
methods  ("single-point  failures"  and  "parts  count") , very  large  system  approxi- 
mations for  fault-tree  applications  and  the  Esary-Proschan  bounds . 
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INTRODUCTION 


System  reliability  analysis  is  closely  related  to  Boolean  logic.  From 
George  Boolefs  classic  1854  book.  The  Laws  of  Thought  [9],  it  can  be  seen 
that  the  kind  of  problem  Boole  was  trying  to  solve 

...  to  make  that  method  itself  the  basis  of  a general  method 
for  the  application  of  the  mathematical  doctrine  of  Probabili- 
ties ...  [9,  p.  1] 

...  it  is  always  possible  ...  to  express  the  event  whose 
probability  is  sought  as  a logical  function  of  the  events 
whose  probabilities  are  given  ...  [9,  p.  15] 

is  essentially  the  same  problem  that  a system  reliability  analyst  tries  to 
solve.  In  this  paper,  the  objective  is  logically  symbolizing  the  success 
or  failure  of  a system  as  a function  of  success  or  failure  for  various  com- 
binations of  its  components,  and  evaluating  the  probability  of  success,  that 
is,  the  system  reliability  as  a function  of  the  component  reliabilities. 

Boole  made  some  errors.  His  explanation  of  "inversion"  involved  a 
clumsy  use  of  series  expansions  and  divisions  by  logical  zero,  in  contrast 
to  the  more  elegant  De  Morgan’s  theorems.  Also,  he  did  not  exhibit  a deep 
understanding  of  "duality".  Subsequent  work  by  others,  based  on  Boole’s 
foundations,  led  to  many  significant  developments  in  mathematics,  symbolic 
logic  and  philosophy,  including  set  theory,  particularly  lattice  theory 
and  the  theory  of  partially  ordered  sets,  Boolean  algebra,  and  minimalization 
by  Quine  [46,  47,  48].  The  contributions  of  Boolean  logic  to  applied  disci- 
plines such  as  computer  science  and  electrical  engineering,  particularly 
through  the  pioneering  work  of  Shannon  [54],  are  well  known  and  recognized. 

Logically  based  system  reliability  analysis  appears  to  have  been  ini- 
tiated by  von  Neumann  in  connection  with  his  search  for  ways  of  explaining 
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how  large  digital  computers  having  many  thousands  of  unreliable  components 
such  as  vacuum  tubes  could  operate  reliably.  In  a paper  published  in  1956 
based  on  lectures  given  in  1952  at  California  Institute  of  Technology, 
von  Neumann  [59]  showed  that  by  the  use  of  redundancy,  it  is  possible  to 
build  and  maintain  a complex  system  having  a greater  reliability  than  any 
of  its  components. 

Moore  and  Shannon  [41]  used  essentially  parallel-series  circuits  con- 
sisting exclusively  of  idealized  identical  relays  all  with  identical  re- 
liabilities to  develop  functions  relating  the  reliability  of  a system  R 
to  that  of  the  components.  Bounds  on  this  relationship  were  developed 
showing  at  what  point  R could  be  greater  than  the  reliability  of  the  com- 
ponents. Mine  [40]  generalized  this  work  further  by  introducing  Boolean 
and  set-theoretic  notation,  and  employed  linear  graph  theory  to  find  the 
functional  relationship  between  system  and  component  reliability.  He  also 
developed  bounds  and  conditions  under  which  arbitrarily  high  R could  be 
obtained. 

Bimbaum,  Esary,  and  Saunders  [8]  extended  the  Moore-Shannon  approach 
to  the  general  class  of  "coherent11  systems  that  have  certain  logical  con- 
sistency properties.  Their  paper  introduced  terminology  which  has  been 
widely  used,  such  as  "minimal  paths",  "minimal  cuts",  and  essential  (rele- 
vant) components".  Esary  and  Proschan  [16,  17]  obtained  approximations 
for  the  reliability  of  coherent  systems,  including  upper  bounds  based  on 
the  minimal  paths  and  lower  bounds  based  on  the  minimal  cuts.  The  theory 
of  coherent  systems  based  on  this  general  approach  is  discussed  by  Barlow 
and  Proschan  [4,  pp.  1-51].  A review  of  the  literature,  with  special 
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reference  to  the  contributions  of  Z.  W.  Bimbaum  and  his  students  and 
colleagues,  is  given  by  Saunders  [52]. 

The  research  reported  upon  in  the  body  of  this  paper  deals  primarily 
with  generalized  approaches  that  apply  not  only  for  a complex  system  of 
any  configuration,  but  also  if  every  component  is  different.  Several  large 
scale  computer  programs  of  this  type  were  prepared  in  the  early  1960s  in 
connection  with  the  U.  S.  space  program.  Some  of  these  programs  employed 
Monte  Carlo  approaches.  As  a general  rule,  very  little  documentation  is 
available  as  to  what  theory  was  employed  or  what  was  done. 

A FORTRAN  software  package  with  the  acronym  SCOPE  (System  for  Com- 
puting Operational  Probability  Equations)  was  developed  about  1965  at  the 
Rockwell  International  Corporation.  SCOPE  provides  an  exact  system  relia- 
bility function  of  the  component  reliabilities  for  a complex  system  of  any 
configuration  but  of  limited  size,  based  on  either  the  minimal  paths  or 
else  the  minimal  cuts.  This  function  is  derived  by  the  method  of  inclusion- 
exclusion,  also  known  as  Poincare’s  theorem.  To  obtain  the  system  value, 
simply  substitute  the  component  values  into  this  function.  The  SCOPE  soft- 
ware is  available  through  NASA  [44],  and  the  mathematical  theory  is  given 
in  [28],  including  some  comparisons  with  the  Esary-Proschan  bounds. 

An  improved  version  of  SCOPE  was  prepared  in  1972  by  Burris  [11]  at 
Oklahoma  State  University  with  the  acronym  MAPS  (Method  for  the  Analysis  of 
the  Probabilities  of  Systems).  Instead  of  FORTRAN,  it  is  programmed  in 
PL/1,  which  is  better  adapted  to  binary-digit  manipulation.  It  also  Incor- 
porates a modularity  feature  so  that  the  system  can  optionally  be  processed 
as  a complex  configuration  of  independent  modules,  each  module  consisting 
of  a complex  configuration  of  independently  failing  elements.  Consequently, 
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there  is  simultaneously  both  an  increase  in  capacity  and  a saving  in  com- 
puter time,  over  the  requirements  of  the  parent  SCOPE  program. 

A further  extension  of  MAPS  is  SPARCS  (Simulation  Program  for  Assess- 
ing the  Reliabilities  of  Complex  Systems)  programmed  by  Cooley  [13]  at 
Oklahoma  State  University  under  Air  Force  Contract  F33615-74-C-4072.  This 
uses  Monte  Carlo  combined  with  Bayesian  techniques  to  assess  (provide  a 
schedule  of  confidence  levels  for  all  values  of  R between  0 and  1)  both  R 
and  the  MTBF  (mean  time  between  failures)  of  a complex  system  of  any  con- 
figuration consisting  exclusively  of  pass/fail  and/or  time-to-failure  com- 
ponents, or  any  mixture  of  these,  from  failure-history  data.  Both  MAPS 
and  SPARCS  are  described  in  [29].  A more  efficient  version  of  SPARCS, 
called  SPARCS-2,  has  been  prepared  by  Lee  [26]. 

A related  and  parallel  effort  to  SCOPE  and  its  daughter  programs  (or 
SPARCS  and  its  parents)  is  the  development  of  the  fault-tree  methodology. 
Initially  designed  for  aerospace  applications  at  Bell  Labs  [6]  in  1961, 
and  subsequently  also  at  Boeing  Airplane  Company,  as  an  aid  to  engineers 
in  analyzing  sequences  of  events  leading  to  system  failure,  it  has  recently 
been  used  extensively  for  reactor  safety  studies  by  the  Atomic  Energy  Com- 
mission and  its  successor,  the  Nuclear  Regulatory  Commission.  A recent 
but  historic  document  reporting  the  results  of  extensive  applications  of 
the  methodology  is  the  "Rasmussen  Report  WASH-1400"  [55],  particularly 
Appendix  II,  "Fault-Trees".  Descriptions  of  the  methodology  are  given  by 
Mearns  [37],  Haasl  [23],  Eagle  [14],  Schroder  [53],  Barlow  and  Lambert  [3], 
Vesely  [56,  57,  58],  and  Barlow  and  Proschan  [4,  pp.  264-266], 

A fault  tree  is  a Boolean-equivalent  diagrammatic  representation  of 
all  the  ways  of  failing  a complex  system  through  combinations  of  failures 
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and  repairs  of  one  or  more  components.  It  is  derived  from  block  diagrams, 
schematics,  and/or  blueprints,  and  logical  analysis  of  the  interrelation- 
ships of  the  elements.  "Minimal  cut  sets"  are  obtained  from  the  tree.  By 
substituting  the  component  values,  an  estimate  of  R,  the  system  availa- 
bility or  the  failure  rate  are  obtained.  A rather  substantial  library  of 
fault-tree  methodology  computer  programs  is  available,  including  programs 
which  build  the  trees,  find  the  cut  sets,  or  perform  numerical  evaluations. 
Vesely  made  some  major  contributions  with  the  development  of  the  PREP  and 
KITT  codes  [56,  57].  Salem,  Apostolakis,  and  Okrent  [51,  pp.  34-45]  and 
Worrell  and  Burdick  [62]  give  reviews  of  the  available  software. 

In  general,  the  fault- tree  methodology  incorporates  the  same  Boolean 
and  probabilistic  theory  that  SPARCS  and  its  parent  programs  do.  However, 
because  the  methodology  is  generally  applied  to  large  systems  having  only 
low  failure-rate  components,  various  types  of  "rare  event  approximations" 
are  employed,  to  save  computer  time.  These  include  both  Monte  Carlo  and 
deterministic  selections  of  components  and  cut  sets  according  to  importance, 
deleting  higher  order  intersection  terms  in  the  probability  equation,  and 
basing  calculations  on  failure  rates  rather  than  probabilities.  "Importance" 
measures  are  discussed  by  Nagel  [42],  Nagel  and  Schroder  [43],  Lambert 
[25],  and  Mazumbar  [36],  and  Barlow  and  Proschan  [4,  pp.  26-29]. 

This  article  is  a review  of  the  state  of  the  art  of  evaluating  R for 
a complex  system  as  a function  of  the  reliabilities  or  failure  rates  of 
the  elements.  It  develops  the  common  Boolean  theoretical  structure  which 
underlies  all  the  different  methodologies,  and  shows  some  of  the  similarities 
of  and  the  differences  between  exact  methods  and  approximations.  There  are 
two  parts:  Part  1,  on  "Logical  Formulation"  and  Part  2 on  "Probability 
Calculations" . 
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Part  1 of  the  paper  gives  the  logical  formulation,  starting  from  set- 
theoretic  first  principles,  from  the  viewpoint  of  lattice  theory.  We 
describe  the  universal  set  U of  system  states  and  the  power  set  ^ of 
lattices  or  events  which  are  subsets  of  U.  Events  such  as  system  success 
or  system  failure  are  collections  of  success  states,  called  "paths",  or 
else  failure  states,  called  "cuts",  described  by  lattice  Boolean  polynomials. 
The  terms  of  the  minimalized  form  of  the  "success"  polynomial  are  the 
"minimal  paths"  and  for  the  "failure"  polynomial  the  "minimal  cuts"  where 
each  term  denotes  a sublattice. 

Part  2 describes  how  to  calculate  the  system  reliability.  First  a 
probability  function,  a numerical-valued  function  of  the  component  proba- 
bilities, is  derived  from  the  lattice  polynomial  by  the  method  of  inclusion- 
exclusion.  Then  the  component  probabilities  are  substituted  into  this 
function.  Part  2 discusses  exact  methods,  error  bounds,  approximations, 
serializing  methods  and  the  fault-tree  methodology. 
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Part  1:  LOGICAL  FORMULATION 


BOOLEAN  ALGEBRA 

Components,  System  States,  and  the  Universal  Set 

A system  has  n zero-one  (binary-valued)  components,  each  representing 
in  reliability  terms  a potential  failure  mode  or  type.  The  value  ”0”  denotes 
a failure  condition  and  “l”  a success.  The  binary  representation  is  not 
meant  to  include  only  attribute  failure  modes.  Both  time-to-failure  com- 
ponents and  those  for  which  success  or  failure  is  defined  by  exceeding  or 
not  exceeding  a critical  value  of  a variable  can  be  included,  since  these 
too  can  either  be  failing  or  operating  successfully. 

A state  of  the  system  is  a binary  n- vector,  with  each  two-valued  com- 
ponent denoting  a value  for  the  corresponding  failure  type.  Since  there  are 
n components  and  two  choices  for  each  component,  there  are  2n  different 
n-vectors,  each  representing  a nunit  event”  or  ”simple  event”  which  is  a 
possible  state  of  the  system.  The  collection  of  these  vectors  is  called 
the  universal  set  U.  Those  n-vectors  which  represent  system  "success”  are 
called  "paths”  and  those  for  "failure”  are  called  "cuts”. 

The  Boolean  operations  of  addition,  multiplication,  and  inversion  are 
performed  on  the  n- vector  elements  of  U,  component  by  component,  in  the 
usual  way:  0+0=0;  0+  1=  1+  1=  1;  0 • 0=0  * 1=0;  1 • 1=  1; 

1=0,  0 = 1.  The  Boolean  sum  of  any  two  or  more  n-vectors  is  their  least 
upper  bound  l.u.b.,  and  the  Boolean  product  is  the  greatest  lower  bound 
g • 1 • b • 
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Let  X = (xj.,  x^),  Y = (y  ^ yn> , Z = (z^  zn)  , x^  y±i 

» 0,  1,  i ■ 1,  ...  n,  be  any  three  binary  n-vector  elements  of  U. 

Both  the  Boolean  operations  of  addition  and  multiplication  are  associative 
((X  + Y)  + Z = X + (Y  + Z),  (X  • Y)  • Z = X • (Y  • Z))  and  commutative 

(X  + Y = Y + X,  X*Y  = Y • X)  and  the  system  is  distributive  with  respect 

to  the  operations  of  addition  and  multiplication  ((X  + Y)  • Z = XZ  + YZ) . 
Since  there  is  a unique  "zero"  element  (0,  . ..,  0),  all  components  failed, 
a unique  "one"  element  (1,  1),  all  components  working,  and  all  Boolean 

operations  are  performed  on  any  two  or  more  n-vector  elements  of  U to  yield 
another  n-vector,  the  universal  set  U is  a Boolean  algebra  (see  Halmos 
[24,  p.  5,  40]). 

Comparison  and  Partial  Ordering 

The  universal  set  U is  partially  ordered,  with  the  £ operation.  For 
any  pair  of  n-vectors  X = (x^,  ...»  x^)  and  Y = (y^,  •••»  yfl) , X £ Y (X  is 
smaller  than  or  equal  to  Y)  if  X • Y = X (x^  = 1 **  y^  = 1).  This  also  im- 
plies that  X + Y = Y.  Equality,  strict  inclusion,  and  strict  noninclusion 

are  defined  as  follows,  respectively: 

X = Y:  X £ Y,  Y £ X 
X < Y:  X £ Y,  Y)<X 
X { Y:  X • Y < X. 


Subsets  and  Lattices 


A lattice  L is  a nonempty  (meaning:  it  contains  at  least  one  ele- 
ment) subset  of  U,  for  which  every  pair  of  elements  (i.e.,  binary  n-vectors) 
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has  both  a g.l.b.  Boolean  product  and  l.u.b.  Boolean  sum  (see  Birkhoff 
[7,  p.  6]).  In  the  probability  sense,  a lattice  is  also  known  as  an 
"event".  Since  every  pair  of  n-vectors  has  both  a g.l.b.  and  l.u.b.,  L 
also  has  a g.l.b.  Boolean  product  and  l.u.b.  Boolean  sum  for  all  the  n- 
vectors  in  the  lattice.  Both  the  g.l.b.  and  l.u.b.  are  members  of  U , but 
not  necessarily  of  L. 

The  differences  between  a lattice  and  a Boolean  algebra  have  to  do 
with  both  complementation  and  membership.  In  a lattice,  the  only  operations 
are  Boolean  addition  and  multiplication;  in  a Boolean  algebra,  there  is  also 
complementation  or  inversion.  With  respect  to  membership,  the  n-vector 
resulting  from  an  operation  or  sequence  of  operations  upon  any  two  or  more 
members  of  the  same  Boolean  algebra  is  in  the  algebra.  By  contrast,  for 
a lattice  L,  the  result  of  either  operation,  addition  or  multiplication, 
on  any  two  or  more  members  of  L does  not  necessarily  result  in  a member 
of  L. 

The  Power  Set 

The  set  of  lattices  generated  by  forming  subsets  of  U is  called  the 

2n 

"power  set"  & . For  a system  with  n components,  & has  2 lattices.  This 

can  be  rationalized  in  essentially  the  same  way  we  explain  why  U has  2n 

elements.  @ has  a set  of  subsets  created  by  forming  collections  of  n-vectors. 

For  every  lattice  in  @ and  every  n-vector  in  U,  there  are  two  choices: 

either  the  n-vector  is  in  the  lattice,  or  else  it  is  not.  Since  these  two 

choices  are  available  for  every  one  of  the  2n  n-vectors  in  U,  there  are 
2n 

2 lattices  in  Q. 
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Complete  Subsets:  Sublattices 


The  most  convenient  way  of  representing  subsets  of  U is  as  inclusive- 
or  unions  of  "sublattices"  or  "complete  subsets".  According  to  Rutherford 
[50,  p.  9],  a sublattice  S is  a lattice  which  includes  both  the  g.l.b.  and 
l.u.b.  for  every  pair  of  elements  in  S.  Let  X e S,  Y e S be  any  n-vectors; 
then  X+YeS,X*YeS.  Equivalently,  a sublattice  is  a lattice  which 
contains  both  its  own  g.l.b.  and  l.u.b.  and  every  n-vector  element  in  the 
interval  between.  Every  sublattice  is  characterized  by  m fixed-valued  com- 
ponents called  "indicators",  m n,  which  have  the  same  value  for  every  one 
of  the  n-vector  elements  in  S.  In  [34],  it  is  shown  that  there  are  3n  sub- 
lattices in  U. 


Example 


A system  has  two  components. 


The  universal  set  U has  2 


2 


= 4 


elements 


U = {00,  01,  10,  11}. 


The  power  set  & has 


16  different  subsets  of  U 


^ = {0,  {00},  {01},  {10},  {11},  {00,  01},  {00,  10},  {00,  11},  {01,  10}, 
{01,  11},  {01,  10},  {00,  01,  10},  {00,  01,  11},  {00,  10,  11}, 

{01,  10,  11},  U}. 


As  is  customary,  0 denotes  the  "empty  set"  (with  nothing  in  it).  There  are 
2 

3=9  sublattices  in  @ 


{00} 

{01} 

{10} 

{11} 

0 • = {00,  01} 
1 • = {10,  11} 
• 0 = {00,  10} 
• 1 = {01,  11} 
U. 
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The  Algebra  of  Sets  and  Lattices 


The  power  set  9 is  an  algebra  of  lattices.  Algebraic  operations  are 
performed  on  lattices  which  are  elements  of  9 to  form  other  elements  of  9. 
Somewhat  comparable  to  the  Boolean  operations  on  n-vector  elements  of  U, 
these  set-algebraic  operations  are  "inclusive-or  union  (|J,  V,  also  called 
’join1)",  "intersection  (n,  A,  also  called  'meet')"  and  "complement".  Let 
L^,  L^  be  any  three  lattices  of  Q.  The  union  and  intersection  of  opera- 
tions are  associative:  (L^  V L 2)  V V (L2  V L^) , (L^  A L2)  A = 

Lj  A (L^  A L^) ; commutative:  V = Lj  V L^,  A A L^;  and 

distributive:  (L ^ V L2)  A = (L^  A L^)  V (L^  A L^) . 

Let  L,  f , L be  any  m lattices  in  0,  m < 2^  „ The  union  .0,  L. 

I*  m y — j-1  j 

consists  of  all  n-vectors  in  at  least  one  of  the  lattices  L,  , . L . The 

1 ' 9 m 

m 

intersection  .fL  L.  consists  only  of  the  n-vectors  in  at  least  one  of  the 
J = 1 J 

lattices  L^,  . ..,  L^.  The  complement  L of  a lattice  L consists  of  all  n- 
vectors  in  U that  are  not  in  L. 


BOOLEAN  POLYNOMIALS  AND  MINIMALIZATION 

A lattice  subset  of  U which  is  also  an  element  of  9 is  characterized  by 
a Boolean  polynomial.  Each  term  or  monomial,  also  frequently  called  "min- 
term",  is  a collection  of  indicators  for  a sublattice  subset  of  the  lattice. 
The  polynomial  or  form  is  an  "inclusive-or"  union  of  these  sublattices.  It 
is  customary  to  employ  the  logical  "or"  (V)  symbol  to  separate  terms;  some- 
times "+"  is  used  instead  if  there  is  no  ambiguity  as  to  the  difference  be- 
tween logic  and  arithmetic. 

For  example,  a system  has  three  binary-valued  components:  x^,  x2,  and 

x0;  x.  denotes  that  x.  = 1 and  x.  = 0.  The  lattice 
3 1 1 1 
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L = V 

has  two  sublattices  x^  = {x^x^x^,  xix2x3>  xix2x3>  xix2x3^  anc^  X2X3  = 

{xjX2X^,  Xj^x^}.  This  representation  is  not  unique.  There  are  other 
sublattice  configurations,  including,  for  example: 

xlx2  V X1X2  V X1X2X3 
xlx3  V xlx3  V X1X2X3* 

Clearly,  because  it  is  shorter  and  has  more  coverage  per  term  and  the  small- 
est number  of  indicators  in  each  term,  "x^  V X2X3?I  better  than  any  of  the 
alternatives. 

The  "optimal"  or  "best"  representation  of  a lattice  is  a "minimal" 
form,  such  as  "x^  V X2X2"  in  the  above  example.  This  implies  the  largest 
coverage  in  each  term,  the  smallest  number  of  terms,  and  the  smallest  average 
length  of  a term.  In  general,  minimal  forms  may  or  may  not  be  unique. 

Quine  [46,  47,  48]  defined  the  scope  of  the  minimalization  problem  in  terms 
of  finding  all  the  so-called  "prime  implicants",  the  terms  of  all  the  dif- 
ferent alternative  minimal  forms.  In  a recent  series  of  papers,  Locks  [31, 
33,  35]  introduced  the  "reversal"  method  of  minimalization,  partially  based 
on  Quine’s  methodology,  without  necessarily  finding  all  the  prime  implicants. 

Coherent  and  Noncoherent  Systems  and  Minimal  States 

Birnbaum  et  al.  [8]  introduced  terminology  which  has  since  been  widely 
used,  such  as  "minimal  paths",  "minimal  cuts",  etc.,  in  the  context  of  a 
coherent  structure  which  has  certain  logical  consistency  properties  appro- 
priate for  system  reliability  analysis.  The  system  has  n zero-one  elements 
x^,,  i = 1,  . . . , n;  x^  = 0 denotes  component  failure  and  x^  = 1 success. 
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The  2n  n-vector  elements  of  U are  divided  into  two  major  lattices  or 
"events'1,  the  lattice  of  "paths"  L for  the  success  states  and  its  comple- 
ment, the  lattice  of  "cuts"  L for  the  failure  states. 

For  a coherent  system,  the  "zero"  element  (0,  0),  all  components 

failed,  is  a cut,  the  "one"  element  (1,  ...,  1),  all  components  successful 
is  a path,  and  no  path  can  be  smaller  than  a cut  (X  a path  and  Y a cut  =* 

X i Y).  Otherwise,  the  system  is  "noncoherent".  In  the  coherent  case, 
when  the  lattice  polynomial  L for  the  set  of  success  states  is  in  its  mini- 
malized  form,  every  sublattice  subset  of  L is  represented  only  by  one-valued 
components  and  no  sublattice  is  a proper  subset  of  any  other  sublattice. 

The  terms  of  the  minimalized  polynomial  for  L are  called  the  "minimal  paths" 
or  else  the  "minimal  path  sets".  Similarly,  the  minimalized  polynomial  L 
for  the  lattice  of  cuts  has  terms  containing  only  zero-valued  components, 
called  the  "minimal  cuts"  or  "minimal  cut  sets". 

For  example,  the  minimalized  lattice  of  paths  for  a two-component 
parallel  case  is 


L = a V b 

and  the  polynomial  of  cuts  has  one  term 

L = ab . 

This  is  an  example  of  coherence.  The  previous  example  is  noncoherent  since 

L = x^  V X2X^,  and 
L = xLx2  V x^. 

This  is  not  coherent  because  the  cut  state  Oil  contains  the  path  010.  While 
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the  literature  does  not  yet  explicitly  define  minimal  paths  and  minimal  cuts 
for  noncoherent  systems,  there  would  be  no  conflict  with  current  practices 
if  we  called  x^  and  minimal  paths  and  x^x^  and  x^x^  minimal  cuts. 

Minimalization  of  a coherent  system  is  simpler  than  for  a noncoherent 
one.  The  minimal  form  in  the  coherent  case  is  unique,  and  the  only  simplifi- 
cation operation  employed  is  absorption  (example:  abc  V ab  = ab) . By  con- 
trast, for  a noncoherent  system,  the  minimal  form  may  or  may  not  be  unique; 
also,  because  complementary  components  are  involved,  the  process  requires 
the  operations  of  merging  (example:  ab  V ab  = a)  and  redundant  indicator 
elimination  (example:  abc  V ac  = ab  V ac) . Cycling  is  also  needed  as  de- 
scribed in  [31]  and  [33]  to  test  for  both  minimality  and  uniqueness. 

The  literature  on  coherent  systems  and  the  financial  investment  that  has 
been  made  in  computer  software  for  computing  estimates  of  the  reliability  of 
coherent  systems  are  rather  extensive.  For  further  information,  refer  to  the 
Introduction.  Comparatively  little  theory  is  available  about  noncoherent 
systems.  However,  there  is  a computer  program  by  Worrell  called  SETS  (Set 
Equation  Transformation  System)  for  noncoherent  minimalization  in  the  con- 
text of  system  reliability  fault-tree  analysis  (see  Worrell  [61]  and  Atkinson 
[1]  in  [2]).  Noncoherence  arises  whenever  there  is  dependency  of  one  failure 
mode  type  upon  another.  Examples  of  potential  applications  of  noncoherence 
include:  maintenance,  test,  repair,  and  human  factors  problems,  the  so- 
called  "common  cause"  failures  [12,  15,  60]  and  [55,  Appendix  IV]  (failure 
modes  which  can  simultaneously  cause  failures  of  more  than  one  component), 
and  "phased  missions"  [18]  (missions  in  which  success  or  failure  in  one 
phase  depends  upon  success  or  failure  in  another  phase)  (see  also  Example  3 
in  [28]  and  Figure  1 of  Powers,  Tompkins,  and  Lapp  [45]).  Since  many  of 
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the  same  definitions  ("minimal  path",  "minimal  cut",  etc.)  seem  to  carry 
over  from  coherence  to  noncoherence,  and  since  the  common  linkage  between 
the  two  seems  to  lie  in  techniques  for  minimalization,  a further  develop- 
ment  of  the  theory  of  noncoherence  would  be  helpful  in  understanding  the 


properties  of  both  coherent  and  noncoherent  systems. 
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Part  2:  PROBABILITY  CALCULATIONS,  SOFTWARE, 

APPROXIMATIONS  AND  ERROR  BOUNDS 

INTRODUCTORY 

The  Boolean  foundations  of  system  reliability  analysis  are  presented 
in  Part  1.  The  transition  from  logic  to  probability  is  the  method  of 
inclusion-exclusion  (IE),  Both  exact  and  approximate  IE  methods  for  gener- 
alized complex  systems  are  described  in  Part  2,  based  on  lattice  theory  and 
the  minimal  path-minimal  cut  approach  of  Part  1.  The  universal  set  U is  a 
probability  space,  and  its  lattice  subsets  are  "events".  The  sublattice 
events  in  the  minimalized  lattice  of  paths  are  represented  by  the  "minimal 
path  sets"  and  those  in  the  minimalized  lattice  of  cuts  by  the  "minimal  cut 
sets".  IE  generates  a system  reliability  function  of  the  component  relia- 
bilities from  the  minimal  paths,  or  else  an  unreliability  function  of  the 
component  unreliabilities  from  the  minimal  cuts.  The  component  probabilities 
are  substituted  into  this  function  to  obtain  the  system  probability. 

System  reliability  analysis  by  exact  methods  frequently  requires  ex- 
cessive computer  time.  By  inverting  the  logic  function  or  portions  of  it, 
or  by  modularizing  the  system,  in  many  cases  the  computer  time  can  be  reduced. 
However,  in  large  scale  applications,  such  as  large  fault-tree  problems  or 
microminiaturized  logic  circuit  "black  boxes"  consisting  of  many  thousands 
of  elements,  cancellations  and  reductions  achieved  solely  in  the  context  of 
the  exact  methodology  may  still  not  be  enough.  Approximations  can  be  employed 
in  those  cases  if  the  probability  of  system  failure  is  small.  Part  2 covers 
exact  methods,  including  computer  software,  and  approximations,  including 
error  bounds. 
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DISCUSSION 


The  Universal  Set  as  a Probability  Space 

In  order  to  simplify  the  calculations,  assume  that  the  n components  all 
succeed  or  fail  independently.  Under  this  assumption,  the  probability  of 
the  unit  event  representing  each  n-vector  element  is  obtained  by  multiplying 
the  probabilities  of  success  (reliabilities)  of  the  one-valued  components 
by  the  probabilities  of  failure  (unreliabilities)  of  the  zero-valued  compo- 
nents. Let 


x_£  = 0,  1,1=1,  ...,n 

be  the  logical  value  of  the  i-th  component;  = 0 denotes  ,?failuren  and 
x^  = 1 denotes  "success”.  Also,  let 

r^,  0 < < 1,  1 = 1,  ...,  n 

be  the  component  reliability  or  probability  of  success,  and  1 - r^  the  com- 
ponent unreliability  or  probability  of  failure.  Let 


x ■ (xi x„> 


be  an  n-vector  element  of  the  universal  set  U* 


X-r 


pr  X = JJj  r±  1 (1  - ri) 


Then 

l-x± 


is  the  probability  of  the  n-vector.  It  is  shown  in  [28]  that  under  these 
assumptions  U is  a probability  space,  since  the  probability  for  each  element 
is  nonnegative,  the  sum  of  the  probabilities  is  one,  and  the  probability  of 
any  subset  is  additive  in  the  probabilities  of  its  mutually  exclusive 
elements . 
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Sublattice  Events 


The  simplest  case  is  a "sublattice  event".  The  probability  of  the 
event  is  the  numerical  product  of  the  probabilities  of  the  indicators,  those 
components  which  have  the  same  value  in  every  n-vector  element  of  the  sub- 
lattice. Let  S be  a sublattice  with  m indicators,  x^,  i = 1,  . ..,  m,  m <_  n; 
for  each  i,  r^  is  the  reliability.  Under  the  independence  assumption: 


= n 

i=i 


x.  1-x. 

r.  1(1  - r.)  1 


An  important  special  case  of  a sublattice  event  is  the  so-called 
logical  "series"  system.  In  this  case,  every  component  must  operate  for 
the  system  to  function  successfully.  The  universal  set  U has  a single  path, 
the  unit  event  (1,  ...,  1),  and  all  other  n-vectors  are  cuts,  because  the 
failure  of  only  one  component  is  all  that  is  needed  in  order  for  the  system 
to  fail.  Thus,  if  there  are  n components,  the  system  reliability  R is  the 
product  of  the  component  reliabilities 


R = r.r0  ...  r . 

12  n 

f,Series  reliability"  implies  "parallel  unreliability"  since  only  one 
component  failure  is  needed  to  induce  system  failure.  The  dual  concept  is 
"parallel  reliability";  if  only  one  component  works,  the  system  operates 
successfully.  By  a similar  reasoning,  parallel  reliability  implies  "series 
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unreliability"  since  the  failure  of  all  components  is  required  for  system 
failure  and  the  unreliability  R = 1 - R is  the  product  of  the  component 
unreliabilities  (1  - rp  (1  - r^)  ...  (1  - rfl) . 

Compound  Sub lattice  Events 

Compound  sublattice  events  are  obtained  from  combinations  of  unions, 

intersections,  and  complements  of  sublattices  and  are  represented  by  Boolean 

polynomials.  The  most  elementary  compound  event  is  an  "intersection",  since 

the  intersection  of  sublattices  is  also  a sub lattice.  The  intersection  L of 

sublattices  L,,  ...,  L 
1 m 


m 


b = f|  b . , 
j-1  3 


being  a proper  subset  of  all  the  L ^ , is  also  a sublattice  represented  by  all 

of  the  indicators  for  all  of  the  L • in  other  words,  the  set  of  indicators 

J 

for  L is  the  union  of  the  indicators  for  the  L ^ . For  example,  the  intersec- 
tion of  the  three  sublattices  x,x.,  x0x0,  x.x,xc  is  a sublattice  x.x^x^x.x.-. 

12’  23’  145  12345 

Under  the  assumption  that  all  components  are  independent,  the  probability  is 
rlr2r3r4r5* 

The  inclusive-or  union  V L2  of  two  or  more  sublattice  events  and 
L2  is  a lattice  and  might  be,  but  is  not  necessarily  a sublattice,  because 
the  lattice  V b2  may  or  may  not  contain  both  the  g.l.b.  and  l.u.b.  Since 
the  intersection  A L2  is  double  counted,  the  probability  of  the  union  is 

pr  bj  V = pr  Lj  + pr  L2  - pr  L ^ A b2* 

For  example,  under  the  independence  assumption 


pr  x^2  V = Pr  xix2  + Pr  X2X3  ~ Pr  X1X2X3# 
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Under  distributivity , a compound  event  is  factored  into  a Boolean 
polynomial.  For  example, 

(x,x0  V x0x0)  A x.x.x-  = x.x^x.Xj.  V x^.x.x.x.. 

1 2 2 3 145  1245  12345 

A further  reduction  is  possible  in  this  case.  Under  the  absorption  rule 
described  in  Part  1,  X1X2X3X4X5  a proper  subset  of  xix2x4x5#  Therefore, 
this  simplifies  to  just  x^x^x^x^  with  a probability  of  r^^r^r^. 

Inclusion-Exclusion:  Inclusive-or  Unions 

The  probability  function  for  the  inclusive-or  union  of  three  or  more 
sub lattice  events  is  built  up  recurrently  by  associating  sets  pairwise.  For 
three  lattices  L^, 

Lx  V L2  V L3  = (L j V L2)  V L3. 

Therefore, 

pr  v Lj  V = pr  V L2  + pr  - pr  (L^  V L2)  A 

■ pr  Lj  V l2  + pr  L3  - pr  (1^  V l3)  A (L2  V l3)  . 

For  the  three  sublattice  events  x^x2,  X2X3*  xlx4x5 

xlx2  V X2X3  V X1X4X5  = (X1X2  V X2X3)  V X1X4X5* 

The  probability  function  is 

pr  (xjX2  V x2x3>  v xix^x5 

= pr  XjX2  V x2x3  + pr  XjX.x^  - pr  (x^  V x2x3) A x^x,. 

- rlr2  + r2r3  - rlr2r3  + rlr4r5  ' 'iViV 


29 


The  generalization  of  this  process  is  the  method  of  inclusion-exclusion 
(IE),  also  known  as  Poincares  theorem  (cf.  Feller  [19,  p.  99]  and  Riordan 
[49]).  Let  the  lattice  L be  the  inclusive-or  union  of  m sublattice  events. 

Let  denote  the  sum  of  the  m probabilities  of  the  sublattices,  the  sum 
of  the  probabilities  of  the  intersections  taken  two  at  a time,  the 
sum  of  the  probabilities  of  the  intersections  taken  three  at  a time,  etc., 
and  the  probability  of  the  intersection  of  all  m sublattices.  Then 

pr  L = Sj  - S2  + S3  - ...  + (-1)  Sm>  (1) 

Minimal  States 

For  system  reliability  analysis,  L could  be  either  the  minimalized 
lattice  of  paths  and  its  terms  the  "minimal  path  sets",  or  else  the  minimal- 
ized lattice  of  cuts  with  terms  that  are  the  "minimal  cut  sets".  Note  that 
the  form  of  the  function  is  not  affected  by  whether  or  not  the  system  has 
the  coherence  property  with  all  common-valued  components.  Therefore,  it 
applies  equally  well  to  both  coherent  and  noncoherent  systems.  Since  L 
could  be  either  the  lattice  of  paths  or  else  the  lattice  of  cuts,  the  designa- 
tion "minimal  state"  is  used  in  the  sequel  for  the  terms  of  a minimalized 
polynomial,  whenever  it  is  possible  to  do  so. 

Computer  Programs 

Computer  software  is  available  to  generate  probability  functions  of  the 
form  of  Equation  (1)  and  to  calculate  the  system  reliability,  with  the  mini- 
mal states  and  the  component  probabilities  as  input,  but  only  for  coherent 
systems.  Three  programs  of  this  type  are  SCOPE  [44],  MAPS  [11],  and  SPARCS 
[13,  29],  all  of  which  are  now  available  as  "packages",  so  that  the  user 
need  supply  only  the  necessary  input  data. 
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Although  all  three  of  these  programs  are  virtually  indistinguishable 
in  the  way  IE  is  employed,  there  are  certain  differences.  For  example, 

SCOPE  is  a two-pass  FORTRAN  program,  with  the  function  generated  in  the 
first  pass  and  the  calculation  in  the  second  pass.  Both  MAPS  and  SPARCS 
are  one-pass  PL/1  programs,  with  all  operations  in  a single  pass.  MAPS  and 
SPARCS  also  have  a modularity  feature  so  that  the  system  can  optionally  be 
processed  as  a complex  configuration  of  independent  modules,  with  separate 
probability  calculations  for  each  module  and  for  the  system  as  a whole. 

Also,  both  SCOPE  and  MAPS  are  "estimation1*  programs  and  provide  merely  a 
single  value  of  the  reliability  of  the  system,  whereas  SPARCS  "assesses" 
the  system  reliability;  this  means  that  it  provides  a schedule  of  confidence 
levels  associated  with  every  potential  value  of  R between  zero  and  one. 

SPARCS  also  optionally  assesses  the  system  MTBF  (mean  time  between  failures). 

SPARCS  assesses  by  a combination  of  Bayesian  and  Monte  Carlo  techniques. 
The  components  are  all  assumed  to  be  either  attribute-type  or  else  time-to- 
failure  type,  or  any  mixture  of  these,  with  no  restriction  as  to  their  place- 
ment. The  component  reliabilities  are  subject  to  Bayesian  prior  distribu- 
tions that  are  functions  of  the  prior  history  data,  either  beta  for  attribute- 
type  components  or  negative-log  gamma  for  the  time-to-failure  case.  The  com- 
ponent values  are  sampled  from  these  prior  distributions  and  substituted  into 
the  IE  system  function  in  a series  of  repeated  Monte  Carlo  trials;  the  re- 
sult is  an  empirical  distribution  of  the  system  reliabilities,  from  which  the 
assessments  are  performed.  A newer  version  of  SPARCS,  called  SP ARCS-2  [26] 
incorporates  a "super  modularity"  feature.  It  is  modular  not  only  in  the 
sense  that  the  system  is  represented  as  a complex  configuration  of  indepen- 
dently failing  modules,  each  module  in  turn  being  a complex  configuration  of 
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independently  failing  components,  but  also  in  that  modules  with  minimal- 
path  reliability  functions  can  be  intermixed  with  minimal-cut  unreliability 
modules,  with  no  restrictions. 

Computer  Time  Management:  Keeping  the  Probability  Function  Small 

The  probability  function*  Equation  (1),  could  potentially  be  enormous, 
because  of  the  upper  limit  of  2m  - 1 terms.  For  example,  with  m = 10,  there 
could  be  up  to  1023  terms  representing  the  ten  minimal  states  and  all  the 
different  possible  sublattice  intersections  of  two  at  a time,  three  at  a 
time,  etc.  Since  the  maximum  increases  as  the  power  of  two  for  additional 
minimal  states,  this  could  be  beyond  the  capacities  of  the  fastest  and 
largest  computers.  Fortunately,  the  IE  probability  polynomial  almost  never 
gets  this  large;  in  most  practical  problems,  the  part  that  is  used  is  a 
tiny  fraction  of  the  maximum.  The  more  overlapping  components  there  are 
between  minimal  states  and  the  more  complementation,  the  more  terms  are 
cancelled.  Modularization  and  inversion  can  help  reduce  the  costs  even 
further. 

Overlapping  components  reduce  the  size  of  the  probability  polynomial 
because  most  of  the  higher  order  intersections  are  nonexistent.  Those 
higher  order  intersections  generated  by  IE  can  have  many  multiple  entries 
cancelling  against  each  other,  so  that  the  ultimate  size  of  the  polynomial 
is  a very  tiny  fraction  of  a 2m  - 1 maximum.  For  example,  Burris  [11]  re- 
ports a case  of  a coherent  system  with  33  partially  overlapping  components 
between  and  among  18  minimal  paths,  processed  by  the  MAPS  program,  with  a 
probability  polynomial  having  only  84  terms  instead  of  the  potential  maxi- 
mum of  2^  - 1 = 262,143. 
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Complementation  reduces  the  size  of  polynomial  by  cancelling  out  inter- 
sections of  complemented  sublattices.  This  can  be  done  only  for  noncoherent 
systems,  because  there  are  no  complements  in  a minimalized  lattice  form  with 
coherence.  Since  there  is  no  known  special  software  for  noncoherent  IE 
polynomial  generation,  there  are  no  examples  of  case  studies  of  experience 
with  complementation. 

Modularization  means  representing  the  system  as  a configuration  of 
modules,  generating  a separate  IE  polynomial  for  each  module,  and  another 
one  for  the  system  as  a configuration  of  modules.  This  not  only  eliminates 
most  of  the  irrelevant  higher  order  intersections,  but  also  saves  processing 
time  because  fewer  components  and  terms  are  processed  at  a time;  also,  the 
combined  sum  of  the  sizes  of  all  IE  polynomials  for  all  of  the  modules  is 
still  very  much  less  than  for  processing  as  a unit.  This  modularization 
feature  is  incorporated  into  both  MAPS  and  SPARCS.  Burris  [11,  p.  82]  re- 
ports that  for  33  components  with  18  minimal  paths,  when  the  system  is  rep- 
resented as  three  modules  in  series,  there  is  a combined  total  of  20  terms 
in  all  three  polynomials , all  of  them  much  shorter  than  the  84  terms  of 
the  probability  polynomial  for  the  integrated  system;  also,  the  processing 
time  was  reduced  from  9.09  seconds  to  1.5  seconds  on  an  IBM  360/65. 

"Inversion"  as  a form  of  computer-time  management  is  related  to  the 
fact  that  the  user  has  an  option  of  doing  either  minimal-path  reliability 
analysis,  or  else  minimal-cut  unreliability  analysis,  whichever  is  more 
convenient,  has  the  smaller  number  of  minimal  states,  requires  the  least 
computer  time,  etc.  The  minimal  cuts  can  be  obtained  from  the  minimal  paths 
or  vice  versa,  by  inverting  the  lattice  polynomial,  using  De  Morgan1 s theo- 
rems, and  minimalizing.  For  a coherent  system,  this  is  particularly  con- 
venient, since  the  only  simplifying  operation  needed  is  absorption.  For  a 
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noncoherent  case,  because  there  are  complementary  terms,  a more  complex 
form  of  minimalization  is  needed  such  as  the  reversal  method  discussed  in 
[31,  33,  35].  In  reference  [32],  a case  is  reported  of  a software  relia- 
bility analysis  problem  originally  due  to  Brown  and  Lipow  [10]  processed 
by  SP ARCS-2  with  artificial  data.  There  are  13  components.  One  module  has  . 
nine  minimal  paths  or  26  minimal  cuts.  The  IE  polynomial  for  the  minimal 
paths  has  91  terms  and  required  31  seconds  of  CPU  time  (370/158)  for  100 
simulations;  the  polynomial  for  the  minimal  cuts  has  421  terms  and  required 
152  seconds  for  100  trials. 

APPROXIMATIONS 

Because  the  reliability  R is  frequently  estimated  for  systems  on  the 
order  of  .95,  .99,  .999,  or  even  higher,  the  probability  of  failure  R = 1 - R 
can  be  very  small.  By  approximating,  we  may  either  overestimate  or  under- 
estimate R but  have  little  noticeable  effect  upon  the  estimate  of  R.  Ap- 
proximations are  based  on  the  idea  that  if  the  probability  of  a failure  for 
a single  component  is  very  small,  the  probability  of  having  simultaneous 
failures  of  two  or  more  components  may  be  insignificant  and  negligible. 

The  approximations  which  are  employed  include:  serializing  methods,  such  as 
"parts  count"  and  "single  point  failures" — these  effectively  delete  minimal 
cuts  with  two  or  more  components;  methods  that  delete  unimportant  components, 
cut  sets,  or  higher  order  intersection  terms  from  IE  generated  polynomials — 
this  includes  the  popular  "singles,  doubles,  etc."  in  the  fault-tree  method- 
ology which  does  a combination  of  all  of  these;  the  Esary-Proschan  bounds; 
and  error  bounds. 
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Error  Bounds 


An  approximation  implies  a willingness  to  accept  a reasonably  small 
error  in  exchange  for  computational  feasibility.  It  is  useful,  whenever 
possible,  to  set  bounds  on  the  amount  of  error.  These  bounds  depend  upon 
the  form  of  the  approximation.  Since  the  analyst  may  be  mixing  several  dif- 
ferent kinds  of  approximations  and  not  doing  any  one  of  them  in  a "pure" 
form,  but  basing  his  way  of  applying  each  type  on  arbitrary  numerical  cri- 
teria, there  is  no  single  error  bound  that  would  ordinarily  be  relevant  to 
a mixture  of  approximations. 

For  example,  consider  the  "singles,  doubles,  etc."  approximation,  vari- 
ations of  which  have  been  widely  used  in  "risk  assessment"  software  for 
fault  trees.  Each  software  package  does  it  a little  differently,  but  the 
standard  treatment  appears  to  be  as  follows:  first,  delete  all  components 
with  very  low  failure  probabilities,  then  all  cut  sets  of  two  or  more  com- 
ponents with  a combined  very  low  failure  probability,  then  three  or  more, 
etc.,  stopping,  for  example,  at  cut  sets  of  five  or  more  components.  Then, 
in  generating  the  IE  probability  polynomial,  delete  all  higher  order  terms 
beyond  a specified  number  of  intersections  or  below  a specified  probability, 
or  possibly  both.  Clearly,  no  generalized  error  bound  can  be  established 
for  a procedure  that  combines  this  many  different  types  of  approximations. 

The  best  known  and  possibly  most  useful  error-bound  model  is  Bonferroni's 
inequality  (see  references  [19,  p.  110]  and  [39,  p.  8]).  This  is  based  on 
truncating  the  IE  probability  polynomial.  Equation  (1),  by  deleting  higher 
order  intersection  terms  beyond  a certain  point.  Suppose  that  only  the  first 
r - 1 terms  of  the  right-hand  side  of  Equation  (1)  are  used,  viz: 

pr  L « Sj  - S2  + S3  - ...  + (-l)r"2  Sr_r 
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Then,  the  maximum  error  (true  value  minus  the  approximation)  is  smaller  in 
absolute  value  than  the  next  term  S^,  Note  that  because  the  polynomial  is 
an  alternating  series,  if  r is  an  even  number,  the  probability  is  under- 
stated, and  overstated  if  r is  an  odd  number.  Messinger  and  Shooman  [38] 
provide  some  Bonferroni-type  error  bound  calculations  for  examples  with 
idealized  components  all  having  common  probability  values,  following  the 
general  Moore-Shannon  approach. 

A conservative  error-bound  model  is  given  in  [27]  , based  on  deleting 
minimal  states  under  the  twin  assumptions  that  the  minimal  states  are  in- 
dependent (i.e.,  have  no  overlapping  components)  and  all  have  identical 
probabilities.  If  the  system  has  m minimal  states  and  m^  of  these  are  not 
accounted  for,  the  error  approaches  1 - exp  {-m^/m}.  Since 

x & 1 - exp  {-x},  for  x < .10, 

this  implies,  for  example,  that  if  up  to  ten  percent  of  the  minimal  states 
are  unknown  or  not  used,  the  maximum  error  in  the  estimate  of  the  system 
probability  is  approximately  equal  to  the  proportion  of  missing  states. 

This  model  overstates  the  error  for  most  practical  problems  because 
the  usual  practice  is  to  delete  only  states  (i.e.,  cut  sets)  having  very 
low  probabilities  and  to  retain  those  with  high  probabilities.  Likewise, 
since  the  minimal  states  are  in  general  mutually  dependent  (i.e.,  with  over- 
lapping components) , the  independence  assumption  causes  a larger  error  than 
would  be  obtained  in  practice.  On  the  other  hand,  this  model  provides  use- 
ful information  in  the  sense  of  the  potential  effect  of  missing  data.  For 
example,  suppose  a system  unreliability  study  is  performed,  and  the  analysis 
inadvertently  or  erroneously  yields  a fault  tree  that  does  not  account  for 
all  of  the  relevant  system  failure  modes.  Then,  based  on  the  possibility 
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that  the  unknown  cut  sets  could  have  a reasonably  high  probability,  the 
maximum  error  1 - exp  {-m^/m}  does  not  seem  to  be  all  that  conservative. 

For  further  discussion  on  the  potential  effect  of  missing  information, 
refer  to  Barlow  and  Lambert  [3] . 

Serializing:  "Parts  Count11  and  "Single  Point  Failure11 

Serializing  is  a form  of  approximation  which  has  a very  simple  model 
to  calculate  an  estimate  of  the  system  reliability.  The  simplifying  as- 
sumptions are  threefold:  (1)  the  system  is  logically  "serial" — the  only 
minimal  cut  sets  are  individual  failures,  and  all  components  are  needed  for 
system  performance;  (2)  the  components  all  have  extremely  low  (constant) 
failure  rates  with  all  individual  failures  governed  by  the  exponential  dis- 
tribution; and  (3)  all  components  have  the  same  duty  cycle. 

An  example  of  serializing  is  "parts  count"  (cf.  Bazovsky  [5,  pp.  90-91]) 
which  is  often  used  for  complex  circuits  consisting  of  very  many  parts  of 
several  different  types,  such  as,  for  example,  microminiaturized  electronic 
circuits  with  many  thousands  of  individual  elements:  so  many  diodes  of  each 
different  type,  so  many  transistors,  so  many  resistors,  etc.  Let  the  system 
consist  of  m different  types,  with  n^,  elements  of  each  type, 

respectively,  and  failure  rates  x^,  ...,  x . To  get  rid  of  the  "nuisance" 
parameter  "mission  time",  scale  time  in  "mission  equivalent"  units;  there- 
fore, the  failure  rates  x^  are  "failures  per  mission".  Under  the  exponential 
assumption,  for  each  type  i the  component  reliability  is  simply  exp  {-x^}. 
Because  of  the  very  high  reliability  assumption,  the  failure  rate  for  each 
type  is  approximately  equal  to  its  unreliability,  that  is, 

exp  {-x^}  « 1 - x^,  i = 1,  . . . , m. 
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The  system  reliability  is  approximately 


m 

exp  {-  S n,x.  } s=« 
i=l  1 1 


1 - 


m 

z 

i=l 


n.  x.  , 

l l 


m 

and  nixi  maY  be  used  interchangeably  as  the  approximate  system  unre- 
liability or  else  the  approximate  system  failure  rate. 

There  is  clearly  an  error  in  this  approximation,  but  we  cannot  tell 
in  advance  how  much  or  in  which  direction  without  a case-by-case  analysis. 

The  method  overstates  R in  the  sense  that  not  all  cut  sets  are  accounted 

for;  R is  also  understated,  however,  in  the  sense  that  the  approximate  system 
m 

unreliability  £ n.x.  is  for  all  practical  purposes  the  sum  of  the  first 
i=l  1 1 

order  terms  of  Equation  (1).  Because  all  terms  after  the  first  one  are 
deleted,  R = 1 - R is  overstated,  and  R understated. 

Another  serializing  method  similar  to  parts  count  is  "single  point 
failure  (spf)n  analysis.  The  assumptions  are  similar  to  those  in  parts 
count,  and  the  method  of  calculation  the  same.  The  difference,  if  any,  is 
one  of  degree;  spf  is  usually  applied  to  large  systems  having  relatively 
few  components,  perhaps  one,  two,  or  three  of  each  type.  Although  the  system 
is  complex  and  possibly  may  have  many  redundancies,  identify,  isolate,  and 
concentrate  attention  upon  the  "single  point  failures",  those  failure  modes 
which  could  individually  cause  failure  of  the  system  because  no  adequate 
backup  is  available.  Because  this  requires  detailed  study  to  isolate  the 
most  significant  potential  trouble  spots,  spf  should  be  viewed  more  as  a 
management  tool  than  a method  of  estimating  the  system  reliability. 


Approximating  the  System  Failure  Rate  and  R with  a Fault  Tree 

An  approximation  is  presented  in  this  section  which  has  been  widely  used 
in  the  fault-tree  methodology.  In  principle,  it  is  a form  of  serializing, 
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like  "parts  count".  Instead  of  serializing  on  the  components,  however,  the 
serializing  is  on  the  minimal  cut  sets.  For  a system  with  m minimal  cuts, 


let  x^,  i = 1,  ...,  m denote  the  net  failure  rate  of  cut  set  i,  that  is, 

the  contribution  to  system  failure  rate,  in  failures  per  "mission  equivalent" 

time  unit.  Since  for  each  component  the  failure  rate  and  the  unreliability 

are  approximately  the  same,  x^  is  the  product  of  the  failure  rates  of  the 

components  in  cut  set  i.  Accordingly,  the  system  failure  rate  is  approxi- 
m 

mately  E x . . As  was  shown  above,  if  time  is  in  "mission  equivalent"  units, 
m i=l  1 

E x^  is  also  approximately  equal  to  the  system  unreliability  R,  and  R is 
1 m m 

therefore  approximately  1 - E x^.  Since  E^  x^  is  an  approximation  to  the 

first  term  of  Equation  (1),  it  can  be  seen  that  this  method  overstates  R and 

understates  R.  The  one  advantage  it  seems  to  have  is  speed,  particularly 

for  hand  computations.  For  further  background,  refer  to  Fussell  [21]. 


The  Esary-Proschan  Upper  and  Lower  Bounds 


In  references  [16]  and  [17],  Esary  and  Proschan  present  an  approximation 
based  on  the  assumption  that  the  minimal  states  are  independent  and  have  no 
elements  in  common,  similar  to  those  incorporated  into  the  conservative  error- 
bound  model  in  reference  [27]  described  above.  The  independence  assumption 
is  equivalent  to  parallel-series,  parallel  in  the  minimal  states  and  series 
in  the  components  within  minimal  states.  It  also  overstates  the  probability 
which  is  being  approximated.  Let  the  system  have  m minimal  states,  either 

paths  or  else  cuts,  with  probabilities  x^,  i = 1 respectively.  The 

approximate  probability  is 

p = 1 - n a - x.). 
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For  example,  if  the  {x^}  are  minimal  paths,  P is  greater  than  the 
system  reliability  R;  this  is  called  the  "upper  bound  on  the  system  relia- 
bility". If  the  {x_^}  are  the  minimal  cuts,  P is  greater  than  the  unrelia- 
bility 1 - R;  1 - P is  called  the  "lower  bound  on  the  system  reliability". 
In  reference  [28]  , there  is  a discussion  of  the  upper  and  lower  bounds  and 
a comparison  to  exact  results  for  a five-component  bridge  circuit.  It 
showed  that  for  highly  reliable  components,  the  upper  bound  greatly  over- 
states R,  and  is  unusable.  The  lower  bound,  however,  appears  to  yield 
fairly  accurate  results — the  more  reliable  the  components  are,  the  greater 
the  accuracy. 
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SUMMARY 


This  paper  attempts  to  amalgamate  the  logic  of  system  reliability 
analysis  with  the  computational  aspects.  A system  reliability  function  is 
derived  by  inclusion-exclusion  from  the  minimal  states,  the  terms  of  a 
minimalized  lattice  polynomial  of  system  states.  Both  exact  methods  and 
approximations  are  covered.  The  theoretical  bases  of  several  widely  used 
approximations  are  discussed.  Related  topics  include  coherence  versus  non- 
coherence, error  bounds,  computer  software,  and  fault-tree  methodology. 
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ABSTRACT* 


The  SPARCS-2  program  for  system  reliability-  and  MTBF-confidence 
assessment  employs  Monte  Carlo  sampling  from  Bayesian  conjugate  prior 
distributions  (c.p.d.'s)  on  the  component  reliabilities.  Deviates  are 
generated  for  two  different  families,  "beta"  for  zero-one  attributes 
(success  or  failure,  pass-fail,  go-no  go,  etc.)  subject  to  a Bernoulli 
process,  and  "negative-log  gamma"  for  constant  failure-rate  times 
to  failure  subject  to  a Poisson-exponential  process.  In  both  cases, 
new  "rejection"  methods  are  used  which  are  at  least  as  accurate  as  and  more 
efficient  than  older  methods. 

The  beta  generator,  an  original  one  designed  by  Professor  J.  Chandler, 
employs  a modified  improper  Cauchy  function  as  an  auxiliary  distribution. 
The  gamma  generator  was  originally  designed  by  Marsaglia  [2] , and  it  is 
also  incorporated  into  the  McGill  Random  Number  Package.  A negative-log 
gamma  deviate  is  obtained  from  the  gamma  by  a change  of  variables,  since 
there  is  a one-to-one  relationship  between  the  two  distributions.  Both 
generators  were  programmed  in  PL/1  by  K.  K.  Lee.  The  beta  generator  is 
named  PROCEDURE  CABTA  and  the  gamma  generator  is  PROCEDURE  RGAMA.  The 
purpose  of  this  report  is  to  describe  these  procedures,  including  a dis- 
cussion of  the  theory  of  rejection  methods. 


This  report  is  based  upon  the  junior  author’s  MS  thesis  in  Computer 
Science  [3].  The  assistance  of  Professor  J.  Chandler  of  Oklahoma  State 
University  in  preparing  it  is  gratefully  acknowledged,  however  the  authors 
assume  all  responsibility  for  errors. 
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TECHNICAL  DISCUSSION:  INVERSION  AND  REJECTION  METHODS 


Inversion 

There  are  two  principal  classes  of  Monte  Carlo  sampling  methods  from 

continuous  distributions,  "inversion"  and  "rejection."  The  difference  is 

that  inversion  inverts  only  the  distribution  function,  whereas  rejection 

is  a more  complex  eclectic  technique  involving  the  probability  density 

function  (p.d.f.).  Rejection  methods  also  incorporate  two-step  Monte 

Carlo  and  frequently  include  inversion.  For  a random  variable  X with 

dF  f x 1 

d.f . F(x)  = pr(X  < x)  and  p.d.f.  f(x)  = — ^ — , inversion  may  generally 
be  described  as  follows:  Generate  a uniformly  distributed  pseudorandom 
variate  u,  0 < u < 1,  from  the  uniform  distribution  U(0,1),  representing 
the  d.f.  F(x).  Then  invert  to  obtain  the  corresponding  percentage  points 
x *=  F”*(u),  by  formula,  if  it  is  possible  to  do  so,  or  by  an  approximating 
polynomial  or  by  some  iterative  method  if  necessary. 

Familiar  examples  of  inversion  are  the  uniform  distribution  U(0,1), 
the  one-parameter  exponential  distribution  with  parameter  X,  and  the 
N(0,1)  normal  distribution  with  zero  mean  and  unit  variance.  For  the 
uniform  distribution,  the  process  is  trivial;  the  pseudorandom  variate 
u * F(u)  is  the  result  desired.  This  is  usually  obtained  by  a congruential 
generator.  For  the  exponential  distribution,  the  d.f.  is 

F(x)  = 1 - exp{-Xx} . 

With  u substituted  for  F,  the  corresponding  exponential  random  variate  is 

v = Mi  - u> 

X 

This  is  an  example  of  inversion  by  formula.  The  N(0,1)  normal  distribution 
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is  an  example  of  inversion  by  an  approximating  polynomial,  since  the 
Hastings  approximations  [1,  p.  191,  p.  192]  are  usually  used  for  this 
purpose. 

Other  examples  of  distributions  which  can  be  inverted  easily  are  the 
Weibull,  and  the  Cauchy  and  other  one-parameter  distributions.  The  Weibull 
can  be  standardized  easily,  and  the  one-parameter  distributions  are  in- 
verted by  formula.  Aside  from  these  examples,  however,  there  are  rela- 
tively few  cases  where  inversion  is  convenient  or  simple.  This  is  due  to 
the  fact  that  most  families  have  two  parameters  or  more  and  cannot  readily 
be  standardized.  Iterative  inversion,  which  is  employed  sometimes,  can 
be  a very  clumsy  and  time-consuming  process  because  of  the  difficulties 
of  converging  within  a reasonable  amount  of  computer  time. 

Rejection  Methods 

There  is  a variety  of  two-step  rejection  generators  based  on  the 
p.d.f.  For  each  deviate,  the  first  step  is  a trial  value.  This  trial 
value  is  either  "accepted"  or  "rejected"  in  the  second  step,  depending  on 
a second  pseudorandom  number,  so  that  the  values  accepted  are  in  proportion 
to  the  p.d.f.  height.  In  the  sequel,  we  discuss  a form  of  "geometric" 
rejection  with  an  approximating  or  auxiliary  distribution  at  the  first 
step  to  generate  a trial  deviate.  This  type  of  rejection,  which  is  exact 
in  the  sense  that  it  depends  upon  an  exact  relationship  between  the  auxil- 
iary distribution  and  the  parent  being  generated,  is  incorporated  into 
SPARCS  in  both  the  beta  generator  CABTA  and  the  gamma  generator  RGAMA. 

In  a simple  "geometric"  rejection  scheme,  the  parent  p.d.f.  f(x) 
is  "blanketed"  from  above  by  an  auxiliary  function  g(x),  g(x)  > f (x) , 


50 


where  g(x)  is  such  that  trial  random  deviates  can  easily  be  generated 
from  a p.d.f.  equal  to 


g(x) 

J_«  g(x)dx 

In  principle,  g is  a "little  taller"  and  a "little  fatter"  than  f,  so  that 
the  two  functions  do  not  intersect,  except  possibly  at  a logical  point  of 
contact,  such  as  the  mode  of  f.  G cannot  be  a "proper1'  distribution; 
because  g is  always  above  the  p.d.f.  for  the  proper  distribution  f 

G(°°)  = J g(x)dx  > 1. 

Parenthetically,  it  should  also  be  noted  that  one  way  of  obtaining  a trial 
deviate  is  to  invert  G,  that  is 

x - G”1  G(x) 


is  the  trial  deviate.  This  is  the  technique  incorporated  into  both  the 

CABTA  and  RGAMA  generators  in  SPARCS. 

The  procedure  is  as  follows.  Obtain  a trial  value  of  the  random 

deviate  x from  an  operation  on  G.  The  decision  as  to  whether  to  use  x 

f (x) 

is  based  on  a pseudorandom  number  U£.  If  U£  £ g(x)",  ',accePt"  x»  otherwise 
"reject"  it  and  repeat  the  process  again. 

A simple  example  of  rejection  is  with  a uniform  distribution  U(0,1) 
as  the  auxiliary  function  for  a beta  distribution.  Let  f(x),  0 £ x_<  1, 
be  the  p.d.f.  for  a beta  distribution  with  mode  XQ  and  density  f(xQ)  at 
the  mode.  Let  us  also  designate  f(xQ)  as  a "scale  factor"  representing 
the  "height"  of  the  uniform  distribution.  Since  the  auxiliary  function 
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is  uniform,  generate  a first  pseudorandom  number  uj  = x as  a trial  value, 
and  calculate  f (x) . Generate  a second  number  U£.  If  the  ratio 


f(x)  . 
f(x0)  - 


u2> 


x is  "accepted"  as  a random  deviate  under  the  beta  distribution,  otherwise 
it  is  "rejected"  and  another  random  number  is  drawn.  The  process  is  re- 
peated as  often  as  necessary  to  obtain  an  acceptable  deviate  x,  and  then 
again  as  many  times  as  necessary  for  the  specified  number  of  Monte  Carlo 
trials. 


The  "efficiency"  of  the  method  described  in  the  previous  paragraph, 
the  ratio  of  the  number  of  deviates  used  to  those  generated,  is  potentially 
very  low  for  highly  concentrated  beta  distributions  having  large  para- 
meters and  narrow,  tall  "spikes",  because  a large  number  of  deviates  would 
have  to  be  generated  to  obtain  acceptable  values.  The  efficiency  is  also 
surprisingly  high  for  diffuse  distributions  such  as  those  with  small 
parameters.  For  example,  if  f were  uniform,  the  efficiency  would  be  100% 
since  all  deviates  would  be  accepted.  Since  rejection  tends  to  have  low 
efficiency  and  computer  time  is  a major  consideration,  higher  efficiency 
is  attained  if  the  auxiliary  function  g and  the  p.d.f.  f are  reasonably 
close  to  one  another. 


The  rejection  generators,  PROCEDURES  CABTA  and  RGAMA,  incorporated 
into  SPARCS  are  both  relatively  efficient  because  the  g(x)  is  relatively 
close  to  f(x).  The  beta  generator  has  a shifted  and  scaled  improper 
Cauchy  function,  with  a mode  identical  with  that  of  the  beta  distribution, 
and  obtains  efficiencies  of  approximately  80%  in  most  cases.  The  gamma 
generator,  PROCEDURE  RGAMA,  employs  a "squeeze-down"  between  two  auxiliary 
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functions,  a normal  from  above  the  gamma  and  an  exponential  from  below, 
and  has  efficiencies  that  are  commonly  90%  or  better. 

THE  BETA  GENERATOR:  PROCEDURE  CABTA 

The  beta  generator,  PROCEDURE  CABTA,  employs  an  improper  Cauchy 
auxiliary  function.  The  modifications  are: 

1.  The  Cauchy  mode  is  shifted  (from  zero)  to  coincide  with 
the  beta  mode  pQ. 

2.  The  Cauchy* height  g(pQ)  at  the  mode  is  (1.1)  times  the 
beta  density  f (pQ) . 

3.  The  beta  normalized  second  derivative,  a measure  of  "curvature", 

f"<Po>  2 s"Cp0> 

p0  is  (1.1)  times  . 

These  modifications  help  insure  that  g(p)  > f(p)  in  the  vicinity  of  the 
center  of  the  distribution,  for  all  but  some  "worst  cases"  which  are  not 
relevant  to  the  applications  of  the  SPARCS  program.  The  general  procedure 
for  deriving  the  Cauchy  function  as  it  is  described  below  does  not  work 
if  the  beta  p.d.f.  has  a mode  at  either  p = 0 or  p = 1.  Fortunately,  a 
simpler  alternative  is  available  in  those  cases,  with  inversion  and  direct 
integration. 

The  Beta  p.d.f.  and  Mode 
The  beta  p.d.f.  is 

f(p)  = flaWT  p3~1(1  ~ p)W’  a > 1,  b > 1,  0 < p < 1.  (1) 

In  the  SPARCS  application,  this  is  a prior  p.d.f.  on  the  component  relia- 
bility p,  where  the  parameters  a and  b are  "sufficient  statistics"  for  the 
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prior  test  data:  a = number  of  successes  + 1;  and  b = number  of  fail- 
ures + 1.  The  specifications  a _>  1,  b _>  1 insure  that  the  distribution  is 
unimodal,  but  (uniform)  U(0,1)  only  in  case  there  are  "no  data".  The 
case  "no  failures  (b  = 1)"  results  in  a mode  at  p = 1,  with  f ( 1 ) = a. 

By  differentiating  Equation  (1)  and  setting  the  derivative  equal  to  zero, 
the  mode  can  be  obtained;  that  is 


e a - 1 

po  e a + b - 2 * 


(2) 


The  Improper  Cauchy  Function  g(p):  The  General  Procedure 


The  general  procedure  for  the  Cauchy  function  described  below 
differs  slightly  from  that  already  incorporated  into  the  SPARCS  CABTA, 
but  it  is  mathematically  simpler  and  obtains  similar  results.  The  major 
difference  is  the  fact  that  in  the  program,  the  second  derivatives  f"(p) 
and  g"(p)  of  the  beta  and  improper  Cauchy  densities  respectively  are  the 
measures  of  curvature,  whereas  in  the  description,  these  are  normalized 
by  dividing  by  the  densities.  Hence  — an<l  are  used  instead 

in  the  description  below. 

The  auxiliary  Cauchy  function  has  the  density 


g(p)  = c —z r , c > 0,  s > 0,  0 < p,  pc  < 1.  (3) 

s + (P  “ P0) 

The  parameters  are  a vertical  scale  factor  c,  a horizontal  scale  factor  s, 
and  pQ,  the  mode  of  the  beta  distribution.  First,  find  s by  the  relation- 

*"<Po>  2 8"<Po> 

ship  --|Yp-y  = (1.1)  ~g~(p0")  * aiK*  t^en  c ^ relationship 

g(pD)  = 1.1  f(pD).  The  formulas  for  s and  c,  which  are  given  below,  are 
very  simple.  The  mode  pQ  is  obtained  by  Equation  (2). 
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Under  the  beta  p.d.f.  (1),  the  "curvature"  at  the  mode  pQ  is 

f'X>  (a  + b - 2)3 

f (Po)  ' " (a  -1) (b  - 1)  • 

Under  the  Cauchy  density  (3) , the  curvature  at  pD  is 

g"<Po)  2 
g(pQ)  “ s * 


f"(Po)  2 8H(Po) 

Since  —tj — r-  = (1.1)  — — r-  , the  formula  for  the  horizontal  scale  factor 


f (Po) 


g(Po> 


s is 


a a 2.42  ».  . 

(a  + b - 2)3 


To  obtain  the  vertical  scale  factor  c,  we  have 


g(pQ)  = f = i-i  f(pQ); 


therefore 


c = (l.l)s  f(Po). 

In  CABTA,  the  first  step  in  obtaining  a beta  deviate  is  inverting  a 
pseudorandom  U(0,1)  number  under  the  Cauchy  function.  We  do  this  with  a 
"proper"  Cauchy  distribution  having  the  same  horizontal  scale  factor  s 
and  location  parameter  p , but  with  the  vertical  scale  factor  c = — . 

Under  these  conditions  the  trial  value  generated  by  the  first  pseudorandom 
number  u^,  0 < u,  < 1,  is 

p = p0  + s tan-1  [tt(u^  - %)]. 
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The  second  step  is  an  accept-reject  decision  for  the  trial  value  p. 
Generate  a second  pseudorandom  number  If  Uj  <_  g^pj’  "accept"  p as 

a beta  deviate  under  the  distribution  being  inverted;  otherwise  "reject"  p. 


Special  Case:  b = 1 

The  case  "b  = 1"  represents  the  Bayesian  prior  distribution  on  p 
when  there  are  "no  failures"  in  the  prior  component  data.  In  this  case, 
the  beta  density  f(p)  simplifies  because  (1)  becomes  a one-parameter 
distribution 

f(p)  = a pa 


By  integration,  we  have 


F(p)  = J q f (x)dx 


Therefore,  a value  of  the  variate  p can  be  obtained  by  direct  inversion  of 
a pseudorandom  number  u,  without  resorting  to  rejection 


P 


1/a 

u 


Limitations  of  CABTA 

Reference  was  made  above  to  a few  cases,  not  occurring  in  normal  SPARCS 
problems,  in  which  CABTA  might  fail.  These  ate  cases  where  the  quantity 

0 mm  1 

U j*  is  either  very  large  or  very  small.  It  should  be  pointed  out  that 

any  such  failure  causes  g(x)  to  lie  below  f(x)  at  some  point,  and  each  time 
a beta  deviate  is  generated,  this  possibility  is  tested  one  or  more  times. 

If  it  occurs,  a message  is  printed  and  the  run  is  terminated.  Thus  each 
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usage  of  CABTA  constitutes  a stochastic  test  of  its  validity  for  a partic- 
ular pair  of  values  (a,b).  It  has  never  failed  on  any  SlPARCS  problem  to 
date. 

The  possibility  of  failure  for  some  (a,b)  values  occurs  because  the 
beta  p.d.f.  may  become  sharply  pointed  (have  a large  curvature)  at  the 
mode,  without  becoming  narrow  at,  say,  its  half-maximum.  An  example  of 
this  is  a = 2,  b = 1.00001,  where  the  p.d.f.  is  almost  a right  triangle 
with  vertices  at  (0,0),  (1,0),  and  (1,2).  (The  case  a = 2,  b = 1 gives 
exactly  a triangular  p.d.f.,  but  this  is  handled  separately  with  no  pos- 
sibility of  failure.) 

It  is  believed  that  CABTA  can  be  modified  to  eliminate  any  possi- 
bility of  failure.  This  may  be  done  in  the  future. 

The  llarsaglia  "Squeeze"  Method  of  Generating  Gamma  Variates 

PROCEDURE  RGAMA  incorporated  in  SPARCS  is  based  upon  a rejection 
technique  called  the  "squeeze  method"  designed  by  Marsaglia  [2].  Since 
that  paper  is  being  published,  we  reproduce  herein  only  the  relevant 
paragraphs  from  Marsaglia' s paper  describing  the  procedure. 
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Refer  to  Figure  1,  which  shows,  for  several  values  of  the  parameter  a, 
three  functions  h(x).  t g(x)  5 f(x).  The  top  function,  f,  is  the  normal 
density.  The  method  is  to  choose  points  (X,Y)  with  a uniform  distribution 
under  f (x)  until  we  get  one  that  also  lies  under  g(x),  then  exit  with 
W ■ a(sX+l-s2)3,  where  s « a ^/3.  We  may  avoid  testing  under  g most 
of  the  time  by  first  testing  under  h.  The  functions  f and  h are  chosen 
to  be  close  to  g and  convenient  to  handle.  This  is  the  essence  of  the 
squeeze  method.  (f  is  not  very  close  to  g when  a < 1,  but  the  procedure 
is  still  reasonable  for  -j  < a < 1 because  h is  close  to  g. 


ALGORITHM  FOR  GENERATING  A GAMMA  VARIATE  V, 

DENSITY  wa~  e~W/T(a),  W > 0,  FOR  ANY  VALUE 
OF  THE  PARAMETER  a > 1/3 , BUT  RECOMMENDED 
FOR  all. 

Step  1.  Generate  a standard  normal  random  variable  X.  Put 
Z *=  eX  + 1-82j  where  e - a^/Z.  If  Z Z 0 repeat 
this  step. 

Step  2.  Generate  a standard  exponential  random  variable  E. 

Step  Z . If 

\X2  - E < + a(Z3-z^J  + (Za-1)  (t+\t2+~gt*) t then  exit 

with  W - aZ3, 
else  if 

|X2  - E < \x2  + aCZ3~z^J  + (Za~l)log(Z/z Q) t then  exit 

with  W - aZ3, 

else  go  back  to  Step  1. 

In  this  algorithm,  xQ  = s - /Z,  zQ  = 1 - sv'T,  t *=  1 - zQ/ Z. 
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